Chapter 12 Time-derivative Models of Pavlovian Reinforcement

Authors

Richard S. Sutton

Andrew G. Barto

Abstract

This chapter presents a model of classical conditioning called the temporal-difference (TD) model. The TD model was originally developed as a neuron-like unit for use in adaptive networks (Sutton and Barto 1987; Sutton 1984; Barto, Sutton and Anderson 1983). In this paper, however, we analyze it from the point of view of animal learning theory. Our intended audience is both animal learning researchers interested in computational theories of behavior and machine learning researchers interested in how their learning algorithms relate to, and may be constrained by, animal learning studies. For an exposition of the TD model from an engineering point of view, see Chapter 13 of this volume. We focus on what we see as the primary theoretical contribution to animal learning theory of the TD and related models: the hypothesis that reinforcement in classical conditioning is the time derivative of a composite association combining innate (US) and acquired (CS) associations. We call models based on some variant of this hypothesis time-derivative models, and we examine several of these models in relation to the TD model. We also briefly explore relationships with animal learning theories of reinforcement, including Mowrer's drive-induction theory (Mowrer 1960) and the Rescorla-Wagner model (Rescorla and Wagner 1972). Although the Rescorla-Wagner model is not a time-derivative model, it plays a central role in our exposition because it is well-known and successful both as an animal learning model and as an adaptive-network learning ...

Similar resources

The misbehavior of value and the discipline of the will

Most reinforcement learning models of animal conditioning operate under the convenient, though fictive, assumption that Pavlovian conditioning concerns prediction learning whereas instrumental conditioning concerns action learning. However, it is only through Pavlovian responses that Pavlovian prediction learning is evident, and these responses can act against the instrumental interests of the ...

Neuropsychology of reinforcement processes in the rat

This thesis investigated the role played by regions of the prefrontal cortex and ventral striatum in the control of rats’ behaviour by Pavlovian conditioned stimuli, and in their capacity to choose delayed reinforcement. First, the function of the anterior cingulate cortex (ACC) in simple Pavlovian conditioning tasks was addressed. The ACC is a subdivision of prefrontal cortex that ...

Pavlovian conditioning from a foraging perspective

Principles in foraging and standard associative learning theories motivate a model for Pavlovian conditioning. The model tracks distal and proximal scales of expected reward probabilities plus the strength of signal-reward association. A combined reward probability is developed by combining the distal and proximal estimates through their uncertainties. Possible neural structure equivalents to t...

Reduction of Pavlovian Bias in Schizophrenia: Enhanced Effects in Clozapine-Administered Patients.

The negative symptoms of schizophrenia (SZ) are associated with a pattern of reinforcement learning (RL) deficits likely related to degraded representations of reward values. However, the RL tasks used to date have required active responses to both reward and punishing stimuli. Pavlovian biases have been shown to affect performance on these tasks through invigoration of action to reward and inh...

Publication date: 1990
